User Modelling on Semi-Structured Documents

Authors

  • Sanghee Kim
  • Wendy Hall
  • Andy Keane
Abstract

We present a new approach that uses the structural information embedded in the documents a user frequently refers to in order to derive a personalized concept hierarchy and to identify user preferences for document searching and browsing. In contrast with conventional methods that ignore the distribution of structural elements, our approach exploits semantic clues defined in the structural tags, so that it takes full advantage of both textual and structural information. Formal concept analysis is applied to define semantic relationships among the concepts, and reinforcement learning is employed to adapt unobtrusively to individual information needs. Given a user's query, the personal ontology and the user model make full use of their knowledge bases to present the relevant documents in a prioritized order, the one in which the user prefers to browse. We demonstrate the practicability of our approach through two experiments. The first showed significantly improved performance compared with a flat vector model. The second compared the accuracy of the structural elements exploited by four users and thereby demonstrated the diversity of user preferences.
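
As a rough illustration of how formal concept analysis can relate concepts, the sketch below enumerates the formal concepts of a tiny document-term context. It is not the authors' implementation: the context `CONTEXT`, the document and term names, and the helper functions are all hypothetical, and the brute-force enumeration is only practical for toy inputs.

```python
from itertools import combinations

# Hypothetical toy context: documents (objects) mapped to the terms (attributes) they contain.
CONTEXT = {
    "doc1": {"xml", "query"},
    "doc2": {"xml", "ontology"},
    "doc3": {"xml", "query", "ontology"},
}


def extent(terms, context):
    """Return the documents that contain every term in `terms`."""
    return frozenset(d for d, attrs in context.items() if terms <= attrs)


def intent(docs, context):
    """Return the terms shared by every document in `docs` (all terms if `docs` is empty)."""
    shared = set().union(*context.values())
    for d in docs:
        shared &= context[d]
    return frozenset(shared)


def formal_concepts(context):
    """Enumerate all (extent, intent) pairs by closing every subset of terms."""
    all_terms = sorted(set().union(*context.values()))
    concepts = set()
    for size in range(len(all_terms) + 1):
        for combo in combinations(all_terms, size):
            docs = extent(set(combo), context)
            concepts.add((docs, intent(docs, context)))
    return concepts


def is_subconcept(a, b):
    """Concept a is more specific than (or equal to) b when a's extent is contained in b's."""
    return a[0] <= b[0]


concepts = formal_concepts(CONTEXT)
for docs, terms in sorted(concepts, key=lambda c: (-len(c[0]), sorted(c[1]))):
    print(sorted(docs), sorted(terms))
```

Ordering the resulting concepts with `is_subconcept` gives a lattice in which the most general concept (here, all documents sharing "xml") sits above more specific ones, which is the kind of hierarchy the abstract refers to.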

Similar resources

Modelling the Webspace of an Intranet

Searching the internet using the currently available search engines is not satisfactory. The techniques used there focus on the extraction of relevant information directly from the documents available on the web. We introduce a new approach, which aims at describing the content of a webspace, formed by a collection of related documents, instead of looking at the single documents. By identifying...

Learning from Labeled Features for Document Filtering

Existing document filtering systems learn user profiles based on user relevance feedback on documents. In some cases, users may have prior knowledge about what features are important. For example, a Spanish speaker may only want news written in Spanish, and thus a relevant document should contain the feature “Language: Spanish”; a researcher focusing on HIV knows an article with the medical subj...

Towards an Adaptation of Semi-structured Document Querying

In our research work, we consider that access to semi-structured documents is carried out by a data-oriented query. For different users issuing the same query, the returned results are always the same, although the users’ characteristics (interests, preferences, etc.) may differ. In order to solve this problem and to offer a personalized access to semi-structured documents, our objective is to impr...

Knowledge Extraction from Semi-structured Data Based on Fuzzy Techniques

In this work we propose a fuzzy technique to compare XML documents belonging to a semi-structured flow and sharing a common vocabulary of tags. Our approach is based on the idea of representing documents as fuzzy bags and, using a measure of comparison, evaluating structural similarities between them. Then we suggest how to organize the extracted knowledge in a class hierarchy, choosing a techn...

Architectural approach for handling semi-structured data in a user-centred working environment

Purpose of this paper: Today, the amount of all kinds of digital data (e.g., documents and e-mails) existing on every user’s computer is continuously growing. Users face great difficulty in handling the existing data pool and in finding specific information. We aim to discover new ways of searching and finding semi-structured data by integrating semantic metadata.

Publication date: 2002